Materials Synthesis Insights from Scientific Literature via Text Extraction and Machine Learning
نویسندگان
چکیده
In the past several years, Materials Genome Initiative (MGI) efforts have produced myriad examples of computationally designed materials in the fields of energy storage, catalysis, thermoelectrics, and hydrogen storage as well as large data resources that are used to screen for potentially transformative compounds. The bottleneck in high-throughput materials design has thus shifted to materials synthesis, which motivates our development of a methodology to automatically compile materials synthesis parameters across tens of thousands of scholarly publications using natural language processing techniques. To demonstrate our framework’s capabilities, we examine the synthesis conditions for various metal oxides across more than 12 thousand manuscripts. We then apply machine learning methods to predict the critical parameters needed to synthesize titania nanotubes via hydrothermal methods and verify this result against known mechanisms. Finally, we demonstrate the capacity for transfer learning by using machine learning models to predict synthesis outcomes on materials systems not included in the training set and thereby outperform heuristic strategies.
منابع مشابه
Towards Pathway Curation Through Literature Mining - A Case Study Using PharmGKB
The creation of biological pathway knowledge bases is largely driven by manual effort to curate based on evidences from the scientific literature. It is highly challenging for the curators to keep up with the literature. Text mining applications have been developed in the last decade to assist human curators to speed up the curation pace where majority of them aim to identify the most relevant ...
متن کاملروش جدید متنکاوی برای استخراج اطلاعات زمینه کاربر بهمنظور بهبود رتبهبندی نتایج موتور جستجو
Today, the importance of text processing and its usages is well known among researchers and students. The amount of textual, documental materials increase day by day. So we need useful ways to save them and retrieve information from these materials. For example, search engines such as Google, Yahoo, Bing and etc. need to read so many web documents and retrieve the most similar ones to the user ...
متن کاملBiomedical Literature Mining for Pharmacokinetics Numerical Parameter Collection
BIOMEDICAL LITERATURE MINING FOR PHARMACOKINETICS NUMERICAL PARAMETER COLLECTION Model-based drug studies have been developing very fast recently. They require high quality pharmacokinetics (PK) parameter numerical data. However, most parameter measurements are still buried in the scientific literature. Traditional manual data extraction is too expensive to handle the exponentially growing numb...
متن کاملMachine learning based Visual Evoked Potential (VEP) Signals Recognition
Introduction: Visual evoked potentials contain certain diagnostic information which have proved to be of importance in the visual systems functional integrity. Due to substantial decrease of amplitude in extra macular stimulation in commonly used pattern VEPs, differentiating normal and abnormal signals can prove to be quite an obstacle. Due to developments of use of machine l...
متن کاملUsing Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media
Social media allows people interact to express their thoughts or feelings about different subjects. However, some of users may write offensive twits to other via social media which known as cyber bullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text "twits" via one of the machine l...
متن کامل